Catastrophic health expenditure during the COVID-19 pandemic in five countries: a time-series analysis

Summary Background The COVID-19 pandemic disrupted health systems in 2020, but it is unclear how financial hardship due to out-of-pocket (OOP) health-care costs was affected. We analysed catastrophic health expenditure (CHE) in 2020 in five countries with available household expenditure data: Belarus, Mexico, Peru, Russia, and Viet Nam. In Mexico and Peru, we also conducted an analysis of drivers of change in CHE in 2020 using publicly available data. Methods In this time-series analysis, we defined CHE as when OOP health-care spending exceeds 10% of consumption expenditure. Data for 2004–20 were obtained from individual and household level survey microdata (available for Mexico and Peru only), and tabulated data from the National Statistical Committee of Belarus and the World Bank Health Equity and Financial Protection Indicator database (for Viet Nam and Russia). We compared 2020 CHE with the CHE predicted from historical trends using an ensemble model. This method was also used to assess drivers of CHE: insurance coverage, OOP expenditure, and consumption expenditure. Interrupted time-series analysis was used to investigate the role of stay-at-home orders in March, 2020 in changes in health-care use and sector (ie, private vs public). Findings In Mexico, CHE increased to 5·6% (95% uncertainty interval [UI] 5·1–6·2) in 2020, higher than predicted (3·2%, 2·5–4·0). In Belarus, CHE was 13·5% (11·8–15·2) in 2020, also higher than predicted (9·7%, 7·7–11·3). CHE was not different than predicted by past trends in Russia, Peru, and Viet Nam. Between March and April, 2020, health-care visits dropped by 4·6 (2·6–6·5) percentage points in Mexico and by 48·3 (40·6–56·0) percentage points in Peru, and the private share of health-care visits increased by 7·3 (4·3–10·3) percentage points in Mexico and by 20·7 (17·3–24·0) percentage points in Peru. 
Interpretation In three of the five countries studied, health systems either did not protect people from the financial risks of health care or did not maintain health-care access in 2020, an indication of health systems failing to maintain basic functions. If the 2020 response to the COVID-19 pandemic accelerated shifts to private health-care use, policies to cover costs in that sector or motivate patients to return to the public sector are needed to maintain financial risk protection. Funding The Bill & Melinda Gates Foundation.


Preamble
This appendix provides methodological detail for estimating catastrophic health expenditure and its drivers as described in the main text, and also includes associated supplementary analyses. The appendix is organized into broad sections following the structure of the main paper. This study complies with the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) recommendations. It includes detailed indicator modelling write-ups and flowcharts, and information on data sourcing, to maximize transparency in our estimation processes and provide a comprehensive account of analytical steps.

GATHER Statement
This study complies with the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) recommendations.1 We have documented the steps involved in our analytical procedures and detailed the data sources used, in compliance with GATHER. For additional GATHER reporting, please refer to Supplementary table 1 on pages 6-8.

Data inputs
For all data inputs from multiple sources that are synthesized as part of the study:

3. Describe how the data were identified and how the data were accessed.
Narrative description of data seeking methodology provided.

4. Specify the inclusion and exclusion criteria. Identify all ad hoc exclusions.
Narrative about inclusion and exclusion criteria by data type provided in linked materials.

5. Provide information on all included data sources and their main characteristics. For each data source used, report reference information or contact name/institution, population represented, data collection method, year(s) of data collection, sex and age range, diagnostic criteria or measurement method, and sample size, as relevant.
Description of metadata for data sources and links to their sources for public download.

6. Identify and describe any categories of input data that have potentially important biases (eg, based on characteristics listed in item 5).
Summary of known biases included in paper and appendix.
Main Text Methods; Main Text Limitations; Appendix, Part 1, Section 3.

For data inputs that contribute to the analysis but were not synthesized as part of the study:

7. Describe and give sources for any other data inputs.
We describe in the text our additional data sources as well as provide links where the data can be downloaded.
Main Text Methods; Appendix, Part 1, Section 2.

For all data inputs:

8. Provide all data inputs in a file format from which data can be efficiently extracted (eg, a spreadsheet as opposed to a PDF), including all relevant meta-data listed in item 5. For any data inputs that cannot be shared due to ethical or legal reasons, such as third-party ownership, provide a contact name or the name of the institution that retains the right to the data.
We describe in the text our additional data sources as well as provide links where the data can be downloaded.

Data analysis

9. Provide a conceptual overview of the data analysis method. A diagram may be helpful.
We describe step-by-step our process in the methods section of the main text with more details in the appendix.
Main Text Methods; Appendix, Part 1, Section 3.

10. Provide a detailed description of all steps of the analysis, including mathematical formulae. This description should cover, as relevant, data cleaning, data pre-processing, data adjustments and weighting of data sources, and mathematical or statistical model(s).
We describe step-by-step our process in the methods section of the main text with more details in the appendix.
Main Text Methods; Appendix, Part 1, Sections 2 and 3.

11. Describe how candidate models were evaluated and how the final model(s) were selected.
We describe step-by-step our process in the methods section of the main text with more details in the appendix.
Main Text Methods; Appendix, Part 1, Sections 4, 5, and 6.

12. Provide the results of an evaluation of model performance, if done, as well as the results of any relevant sensitivity analysis.
We describe step-by-step our process in the methods section of the main text with more details in the appendix.
Main Text Methods; Appendix, Part 1, Section 6.

13. Describe methods for calculating uncertainty of the estimates. State which sources of uncertainty were, and were not, accounted for in the uncertainty analysis.
We describe step-by-step our process in the methods section of the main text with more details in the appendix.
Main Text Methods; Appendix, Part 1, Sections 4, 5, and 6.

14. State how analytic or statistical source code used to generate estimates can be accessed.
Access statement provided.
Main Text, Data Sharing.

Results and Discussion

15. Provide published estimates in a file format from which data can be efficiently extracted.

Section 1a. History of Catastrophic Health Expenditure
The World Bank and the World Health Organization have been reporting the incidence of catastrophic health expenditure (CHE) at the global level since 2015, when the United Nations launched the Sustainable Development Goals (SDGs). SDG Indicator 3.8.2 is formally defined as the proportion of a population with large household expenditures on health as a share of total income or household consumption expenditure, commonly referred to as CHE.2

SDG Goal 3: Ensure healthy lives and promote well-being for all at all ages.
Target 3.8: Achieve universal health coverage, including financial risk protection, access to quality essential health-care services and access to safe, effective, quality and affordable essential medicines and vaccines for all.
SDG Indicator 3.8.2: Proportion of population with large household expenditures on health as a share of total household expenditure or income.

Section 1b. Definitions of indicator
Catastrophic health expenditure (CHE), a common measure of financial risk protection, occurs when out-of-pocket (OOP) health expenditure exceeds a pre-defined share of household income or household consumption spending. In this case, the thresholds for CHE are 10% and 25% of total household expenditure or income, to align with SDG Indicator 3.8.2.

Household equation:

$$\text{CHE} = 1 \quad \text{if} \quad \frac{\text{Health expenditure}}{\text{Total household consumption expenditure}} > 0.1 \text{ or } 0.25$$

Population equation: the survey-weighted percentage of households with CHE = 1 (see Section 2d).

Section 2a. Sources
The present study used two primary types of input data: (1) individual and household level survey microdata; and (2) tabulated data from the Health Equity and Financial Protection Indicator (HEFPI) database.
Microdata for 2020 were publicly available for two countries: Mexico and Peru. In Mexico, the National Survey of Household Income and Expenditure (ENIGH) has been implemented between August and November biennially since 1984.3 In Peru, the National Household Survey on Living Conditions and Poverty (ENAHO) has been conducted year-round since 1995.4 Tabulated data for 2020 describing CHE were used for Vietnam, Russia, and Belarus. These estimates were derived from long-running surveys in the respective countries. Since 1995, the Belarus Household Survey has been conducted annually and over the course of the year.

Section 2c. Data collection in 2020
Across the world, the COVID-19 pandemic, and responses thereto, disrupted face-to-face data collection. Here we describe how that affected the data we used in our analysis.
Peru. In Peru, the ENAHO is typically conducted all year round and this continued despite the pandemic. In 2020, surveys were conducted face-to-face between January and mid-March, and between October and December.9 Between late March and September, however, phone interviews and other alternative methods were used to collect data. The National Institute of Statistics and Informatics in Peru made extensive efforts, including many different follow-up methods, to ensure response rates in the phone interviews were similar to those in the face-to-face surveys. Survey documentation from the Peru ENAHO in 2020 reports that no significant differences were found in household characteristics between those surveyed via phone and in person, except for the type of fuel used for cooking. However, significant differences were found in sex and age group between the population surveyed via phone versus in person. We did not find substantial differences in the percent of households in each socioeconomic stratum by type of questionnaire, as shown in Supplementary figure 1.
Belarus, Russia and Vietnam. Detailed documentation on 2020 data collection procedures for the other countries was not available. However, in Belarus, as we note in the discussion in the main text, little-to-no social distancing occurred, which suggests that there is little reason to believe that response rates differed from prior years because of the pandemic. Similarly, Vietnam had minimal disruptions to in-person interaction because COVID-19 cases remained low, and so we do not have reason to believe response rates differed in this case either. We were unable to identify documentation describing the data collection procedures employed in Russia in 2020 and how they may have differed from data collected prior to 2020.

Section 2d. Calculation of CHE
To calculate CHE, we computed the share of consumption expenditure attributed to OOP health expenditure. We then created a binary variable for each of the CHE thresholds of 10% and 25%, assigning a household a value of 1 if its health expenditure was greater than 10% or 25% of its total consumption expenditure, and 0 otherwise. Using the binary values assigned and survey weights, we calculated the percentage of households with CHE using each threshold.
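The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study: the `Household` record, its field names, and the handling of households with zero recorded consumption are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Household:
    oop_health: float   # out-of-pocket health spending (hypothetical field name)
    consumption: float  # total consumption expenditure (hypothetical field name)
    weight: float       # survey weight (hypothetical field name)


def che_flag(hh, threshold):
    """Return 1 if OOP health spending exceeds `threshold` of consumption, else 0."""
    if hh.consumption <= 0:
        # Assumption for this sketch: households without recorded consumption are not flagged.
        return 0
    return 1 if hh.oop_health / hh.consumption > threshold else 0


def che_percentage(households, threshold):
    """Survey-weighted percentage of households with CHE at the given threshold."""
    total_weight = sum(hh.weight for hh in households)
    flagged_weight = sum(hh.weight * che_flag(hh, threshold) for hh in households)
    return 100.0 * flagged_weight / total_weight


# Toy data: health shares of 5%, 15% and 30% of consumption.
sample = [
    Household(oop_health=50.0, consumption=1000.0, weight=2.0),
    Household(oop_health=150.0, consumption=1000.0, weight=1.0),
    Household(oop_health=300.0, consumption=1000.0, weight=1.0),
]
che10 = che_percentage(sample, 0.10)
che25 = che_percentage(sample, 0.25)
```

In this toy sample, two of the three households exceed the 10% threshold and one exceeds the 25% threshold; the weights determine how much each household contributes to the percentage.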

Section 3b. Calculation of any private healthcare visit
In Mexico, a private healthcare visit was defined for individuals who reported pain, discomfort, illness, or an accident that prevented them from performing daily activities in the past year and who received care in a private healthcare setting (private clinics and hospitals, pharmacy). In Peru, it was defined for individuals who reported symptoms of cough, headache, fever, nausea, other illness such as flu or colitis, relapse of chronic illness, an accident, or symptoms of COVID-19 and who received care at a private setting (private doctor's office, private clinic, pharmacy).

Section 3c. Calculation of share of healthcare visits in the private sector
In both Mexico and Peru, the proportion of individuals who had a visit in the private sector was calculated among all individuals who had a healthcare visit, by month. In Peru, the month is ascertained from the date of the interview, since the recall period for healthcare visits was the last four weeks. In Mexico, the month is ascertained from the reported date of the healthcare visit. We also calculated, by month, the proportion of all individuals with a healthcare visit who had a visit in the private sector excluding pharmacies, and the proportion who had a visit at a pharmacy. We note that there are substantially different patterns in Mexico and Peru when considering the role of private pharmacies separately from other types of private provider, as shown in the figures below.
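As a minimal sketch of the monthly share calculation, the snippet below groups individuals with a healthcare visit by month and computes the share who used the private sector. The record layout `(month, is_private, weight)` and the use of survey weights at this step are assumptions for illustration; the text does not specify either.

```python
from collections import defaultdict


def private_share_by_month(visits):
    """Share of individuals with a healthcare visit who used the private sector, by month.

    `visits` is an iterable of (month, is_private, weight) tuples, one per
    individual with any healthcare visit (hypothetical layout).
    """
    total = defaultdict(float)
    private = defaultdict(float)
    for month, is_private, weight in visits:
        total[month] += weight
        if is_private:
            private[month] += weight
    return {month: private[month] / total[month] for month in total}


# Toy records for two months.
records = [
    ("2020-03", True, 1.0), ("2020-03", False, 3.0),
    ("2020-04", True, 2.0), ("2020-04", False, 2.0),
]
shares = private_share_by_month(records)
```

The same function could be reused for the pharmacy-only and non-pharmacy private shares by changing which visits are flagged as `is_private`.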
Difference regressed on lag-difference, differenced first-order model:

$$\Delta Y_t = \beta_0 + \beta_1 \Delta Y_{t-1} + \varepsilon_t \quad (2)$$

where $Y_t$ is the CHE value and $\Delta Y_t$ is the differenced value of the CHE value. Both models were run with and without a constant, $\beta_0$. For each model, we drew 1000 sets of coefficients using the variance-covariance matrix. We combined the 1000 predictions from each model for a total of 4000 predictions and took the 2.5th and 97.5th percentiles as well as the mean to construct a point prediction and 95% confidence interval of CHE in 2020. We compared the observed value and the predicted value to note any deviations in CHE.
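The pooling step can be illustrated with a short sketch. It assumes that each of the four models has already produced 1000 predicted values of CHE for 2020; here those draws are simulated placeholders, and the function names and the linear-interpolation percentile rule are choices made for this example rather than details taken from the study.

```python
import random


def percentile(sorted_values, q):
    """Percentile (q in [0, 100]) of an already sorted list, using linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def pool_predictions(draws_by_model):
    """Pool draws from all models; return (mean, 2.5th percentile, 97.5th percentile)."""
    pooled = sorted(x for draws in draws_by_model for x in draws)
    mean = sum(pooled) / len(pooled)
    return mean, percentile(pooled, 2.5), percentile(pooled, 97.5)


rng = random.Random(0)
# Placeholder draws: stand-ins for 1000 predictions from each of four models.
draws_by_model = [
    [rng.gauss(mu, 0.5) for _ in range(1000)] for mu in (3.0, 3.2, 3.4, 3.6)
]
mean, lower, upper = pool_predictions(draws_by_model)
```

Pooling the draws before taking percentiles lets disagreement between models widen the interval, in addition to the coefficient uncertainty within each model.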

Section 4a. Model selection
We undertook a model selection process to determine our prediction approach. In addition to the models in equations (1) and (2) with and without an intercept term, we tested a version of the model with a linear term for time (Models 1 and 6) for both the level and differenced CHE. We also tested a differenced model with two lags (Model 7).
We also tested the (ultimately selected) ensemble model based on models 2-5. We based our selection on the normalized out-of-sample (OOS) root mean square error (RMSE) for the most recent data point prior to 2020 (2019 in Peru and Belarus, 2018 in Mexico and Vietnam, and 2014 in Russia). We ran each of the models described below without 2020 and the most recent prior point, predicted the value of the most recent prior point, and compared it to the observed value. The normalized RMSE is calculated by dividing the root squared difference between the predicted and observed values by the observed CHE value. The normalized RMSE allows us to more easily compare CHE (10%) and CHE (25%), and to compare across countries with substantially different levels of CHE.

Across the normalized RMSE results, the ensemble models perform better OOS for CHE (10%) than any individual model. Ultimately, we selected ensemble 2 because it draws from all models with lags and demonstrates good performance in the countries with the highest and lowest data coverage. We note that ensemble 4 outperforms ensemble 2 in terms of normalized OOS RMSE, but the large (non-significant) increase in CHE predicted at the mean in Russia shows that this model does not perform well in both high- and low-data coverage countries. Ensembles 3 and 5 also outperform ensemble 2 for other measures of OOS RMSE in tables 4a and 4b. For this reason, we depict the results of ensembles 2-5 below, which show that the overall findings of a significant change in CHE in Mexico (10% and 25%) and Belarus (10%), and in no other countries, are robust to the selection of ensembles.

Supplementary table 4a. Normalized RMSE of twelve models predicting CHE (10% and 25%) with out-of-sample predictions of the most recent year of data available prior to 2020

Section 5. Interrupted time series analysis of stay-at-home orders
Our interrupted time series (ITS) approach was used to conduct two separate analyses: one for the healthcare visits per person outcome and another for the private share of healthcare visits outcome. In both, we test for a shift in the level and trend of the outcome at the time both countries implemented stay-at-home orders in response to the global COVID-19 pandemic. We used distinct transformations of the outcome and distinct specifications for each. Because we have different time frames of data available for the respective countries, slightly different specifications are used in each.
For the following equations, $Y_t$ represents utilization rates or private sector share in 2018, 2019 or 2020 by month; $COVID_t$ is an indicator representing April through December 2020; and $COVID_t \times month_t$ is an interaction between the COVID period and the month of year count. We are interested in $\beta_3$, the change in levels when lockdowns commenced, and $\beta_4$, the shift in the monthly trends when lockdowns commenced.
Private share visits. To examine the share of healthcare visits in the private sector, we use a simple ITS approach, with a slightly different specification in each of the countries.
In Peru, we have a continuous series spanning 2019-2020. We thus specified an OLS regression for private share visits in Peru in which $month_t$ is a count for month in the series (1-24).
In Mexico, we do not have a continuous series, but instead have data from 2018 to make a stronger estimate of the pre-pandemic trend. For private share visits in Mexico, we estimated a regression in which $month_t$ is a count for month of the year (1-12) and $year2020_t$ is an indicator representing the year 2020. The intercept for 2020 ensures that $\beta_3$ and $\beta_4$ are not estimating the change in levels between 2018 and 2020.
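The private-share specifications described above could be written out as follows. This is a sketch consistent with the verbal description, not a reproduction of the estimated equations: the variable names ($month_t$, $COVID_t$, $year2020_t$) are labels chosen here, and the coefficient numbering in the Peru equation (which skips $\beta_2$ so that $\beta_3$ and $\beta_4$ keep the same meaning in both countries) is an assumption.

```latex
% Mexico: 2018 and 2020 data, month of year 1-12 (assumed form)
Y_t = \beta_0 + \beta_1\, month_t + \beta_2\, year2020_t
      + \beta_3\, COVID_t + \beta_4\, (COVID_t \times month_t) + \varepsilon_t

% Peru: continuous series, month 1-24, no separate 2020 intercept (assumed form)
Y_t = \beta_0 + \beta_1\, month_t
      + \beta_3\, COVID_t + \beta_4\, (COVID_t \times month_t) + \varepsilon_t
```

Under this form, $\beta_3$ captures the shift in level at the start of stay-at-home orders and $\beta_4$ the change in the monthly slope afterwards, matching the interpretation given in the text.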

Healthcare visits per person.
In each of the countries, we used a differenced ITS approach for both all healthcare visits per person and private healthcare visits per person. A differenced approach was required because there is substantial seasonal variation in healthcare visits per person, part of which is due to the method of data collection (recall periods and period of data collection) used in each of the countries. In Peru, we thus divided the healthcare visits per month by the prior year's healthcare visits in the same month (e.g., January 2020 divided by January 2019). In Mexico, we use the same approach but divide by 2018 rather than 2019, since the most recent prior year of data available is 2018. We have no reason to believe that seasonal patterns would be substantially different in 2019 versus 2018 in Mexico, but the choice of baseline year does affect the interpretation of the results.
Because we have a similar transformation of data and are using the prior years in similar ways, we use the same specification in both Peru and Mexico, in which $month_t$ is a count for month of the year (1-12), although we are only able to extend the analysis until September 2020 in Mexico. Because data collection spanned August through November, once we extend beyond the midpoint of data collection, we encounter substantial differences due to the varying samples (e.g., data collection did not proceed in exactly the same way in 2020 as in 2018; different regions and logistics made the samples distinct, making the results of the differenced healthcare visits per person inconsistent).
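The seasonal transformation described above (each 2020 month divided by the same month in the baseline year) can be sketched as follows. The function name, the dictionary layout, and the toy values are hypothetical and serve only to illustrate the transformation.

```python
def year_over_year_ratio(current, baseline):
    """Divide each month's visits per person by the same month in the baseline year.

    `current` and `baseline` map month of year (1-12) to healthcare visits per
    person. Months missing from the baseline, or with a zero baseline, are skipped.
    """
    return {
        month: current[month] / baseline[month]
        for month in sorted(current)
        if month in baseline and baseline[month] != 0
    }


# Toy values: 2019 as the baseline year (as in Peru); Mexico would use 2018.
visits_2019 = {1: 0.20, 2: 0.25, 3: 0.20, 4: 0.20}
visits_2020 = {1: 0.22, 2: 0.25, 3: 0.15, 4: 0.05}
ratio = year_over_year_ratio(visits_2020, visits_2019)
```

A ratio of 1 means no change relative to the same month in the baseline year, so seasonal patterns shared by both years cancel out before the ITS regression is fitted.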
We provide the full set of results for these ITS regressions in the two supplementary tables below.

We conducted an ecological spatial analysis to assess the association between CHE and its drivers and the burden of COVID-19 at the administrative division of residence in Mexico and Peru. We hypothesized that CHE, OOP expenditure and healthcare use might increase where there are more COVID-19 deaths if COVID-19 care was additional to other healthcare. Alternatively, they might decline where there are more COVID-19 deaths if potential healthcare users stayed away from health facilities, a situation where COVID-19 care may have displaced or repelled other healthcare users. We restricted our analysis to administrative divisions with responses from at least 20 households but conducted sensitivity analyses with other thresholds (shown in Section 6e). All variables produced from the microdata were averaged by the administrative unit in which the cluster was located using survey weights and then merged with COVID-19 information that was available only at the administrative unit level. Not all administrative units are captured in the analysis because of the sampling in the surveys. Where there are data, however, there are only one-to-one matches; we do not conduct a multilevel analysis.

Supplementary table 9a: Interrupted time series regressions, Mexico
We estimated the association between CHE and its drivers at the subnational level with log-transformed COVID-19 deaths per 100,000 population for each municipality in Mexico and each district in Peru ($i$), controlling for log-transformed average consumption expenditure per household member, the proportion of people older than 25 with less than post-secondary education, average age, and the proportion of people living in a rural area. The outcome $Y_i$ is, in turn: CHE (10%); CHE (25%); log-transformed average OOP spending per household member; share of respondents using any healthcare; share of respondents using private health care; and the share of healthcare visits that took place in the private sector.
In Mexico, where municipality of residence (n = 2,388) was available for each household, we matched household clusters with Ministry of Health data on COVID-19 hospitalization, case, and death rates by municipality.11 In Peru, where district of residence (n = 1,720) was made available for each household, we matched household cluster data with Ministry of Health data on COVID-19 hospitalization, case, and death rates by district.12

Section 6b. COVID-19 data sources
We obtained data on COVID-19 metrics from the Secretariat of Health's Historical Open Database 2020 for Mexico and the Ministry of Health's Open Database for Peru on 09/30/2021. The two countries' surveys have different recall periods and information on timing, which resulted in us merging on slightly different time frames in each country. In Mexico, the recall period was 4 weeks for household consumption expenditure, household OOP health spending and CHE (10% and 25%), and 1 year for healthcare utilization. In Peru, the recall period was 4 weeks for household consumption expenditure, household OOP health spending, CHE (10% and 25%), and healthcare utilization. Spending and CHE measures were annualized for this analysis. To align with these recall periods after adjustment using the available COVID-19 data, we merged these covariates on cumulative COVID-19 metrics between April and November in Mexico, and between April and December in Peru.
We employed a trimming approach based on household counts by geography to stabilize regression results.We opted to use municipalities and districts with 20 or more households in the final analyses because it permitted us to use the most expansive portion of the data possible (96.8% or 86,154 households in 910 municipalities in Mexico and 71.1% of available observations or 18,459 households in 331 districts in Peru).We present sensitivity analyses that use varying levels of this cutoff in the following section 6e.

Section 6c. Model specifications
Predictor and outcome measures in the COVID-19 regression analyses were calculated for each respondent or household using microdata from Mexico and Peru and then aggregated to each cluster. Household OOP health spending per household member, catastrophic health expenditure (10% and 25%), household consumption per household member, and rurality were measured at the household level. Age and indicators of any healthcare visit and private healthcare visit were measured among all respondents. Post-secondary educational attainment was measured among respondents over 25, and share of healthcare visits in the private sector was measured among respondents with a healthcare visit in the past year in Mexico or the past four weeks in Peru. All predictor and outcome measures were aggregated to the municipality and district levels for Mexico and Peru respectively using survey weights. Covariates were selected based on existing research and theory.
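The aggregation step, together with the minimum-household cutoff used to trim small administrative units, can be sketched as below. The tuple layout `(unit_id, value, weight)` and the function name are hypothetical; the default cutoff of 20 households follows the text, and a different value can be passed to mimic the sensitivity analyses.

```python
from collections import defaultdict


def aggregate_by_unit(households, min_households=20):
    """Survey-weighted mean of a household-level measure per administrative unit.

    `households` is an iterable of (unit_id, value, weight) tuples (hypothetical
    layout). Units with fewer than `min_households` responding households are dropped.
    """
    count = defaultdict(int)
    weight_sum = defaultdict(float)
    weighted_value = defaultdict(float)
    for unit, value, weight in households:
        count[unit] += 1
        weight_sum[unit] += weight
        weighted_value[unit] += weight * value
    return {
        unit: weighted_value[unit] / weight_sum[unit]
        for unit in count
        if count[unit] >= min_households
    }


# Toy example with a cutoff of 2 households: unit "B" is trimmed.
rows = [("A", 1.0, 2.0), ("A", 0.0, 2.0), ("B", 1.0, 1.0)]
means = aggregate_by_unit(rows, min_households=2)
```

Passing a binary CHE flag as `value` yields the unit-level CHE share; passing a continuous measure such as OOP spending per household member yields its weighted mean.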

Section 6d. Supplementary results
In Figure 9, we depict the coefficients from the regressions of CHE and its drivers on sub-national variation in the natural log-transformed COVID-19 death rate. We opted to focus on the COVID-19 mortality analysis as our main results due to concerns about missingness in the other COVID-19 metrics. While low case and hospitalization rates may reflect insufficient testing resources in a municipality or district, we believe that existing national health information systems in Mexico and Peru are better able to capture COVID-19 deaths, although there is likely to be some missingness in these data sources as well.15 We found that, in Mexico, where the COVID-19 death rate was higher, OOP health spending, CHE (10%), CHE (25%), and private sector visits were lower. In Peru, in contrast, we find that OOP health spending, CHE (10%), CHE (25%), healthcare use, and private sector healthcare use were all higher where COVID-19 deaths were higher. Full regression results are in tables 10-17.
In interpreting these results, it is important to note the overall change in healthcare use in the two countries. While healthcare use dropped substantially overall in Peru, the cross-sectional analysis indicates that it fell less in areas where COVID-19 mortality was higher, suggesting that people may have been more likely to leave home for COVID-19 care than for other types of care during this period. Strict stay-at-home orders may have also prevented people from getting care that, in Mexico, was still being pursued. In Mexico, healthcare use did not drop as substantially. Thus, the finding that healthcare use was lower in areas with higher COVID-19 death rates suggests that people may have put off care in areas where health facilities were treating many COVID-19 patients but were not deterred from healthcare use in areas where there were fewer deaths from COVID-19, as they may have been in Peru.

Section 6e. Sensitivity analyses
In this section, we present results from varying the cutoff of the number of households per cluster in the linear probability model. We depict the regression coefficients and p-values from regressions using different cutoffs for the minimum number of households per administrative unit. The direction of the coefficient for our covariate of interest, death rates, does not change across cutoffs for any dependent variable, with the exception of CHE (25%) in Peru. The cutoff does tend to influence the statistical significance of the results. As fewer units are available when a higher cutoff is enforced, the results tend to no longer be statistically significant. Since the direction of the results stays the same regardless of the cutoff in almost all cases, we interpret the cutoff value as affecting statistical power rather than the direction of the association.
Rates were calculated using the most recent publicly available population data at the municipality and district levels for Mexico and Peru respectively. Population counts for municipalities in 2020 were extracted from the Census of Population and Housing 2020 conducted by the National Institute of Statistics and Geography (INEGI) for Mexico; estimated population counts for districts in 2020 were based on the 2017 Peru Population and Housing Census conducted by the National Institute of Statistics and Informatics (INEI).14

Supplementary table 2. Microdata availability in input data Country Source Microdata
Data from Vietnam originated from the Living Standards Measurement Study, which has collected household expenditure data since 1992.7 For Belarus, we used the CHE estimates reported by the National Statistical Committee of the Republic of Belarus. In Russia and Vietnam, we used CHE estimates reported by the Global Monitoring Report on Financial Protection in Health published by the World Bank and World Health Organization (WHO) in 2021 and made available in the HEFPI database.8 The Russia Household Budget Survey has been conducted at the household level quarterly and in a continuous cycle by the Federal State Statistics Service since 1952.6

Supplementary figure 4. Predicted and observed values of CHE (10% and 25%) using Ensemble 2 (models 2-5)

12,13 From these data sources, we extracted information on the number of COVID-19 tests, cases, hospitalizations, and deaths. In Peru, COVID-19 metrics were reported by health systems and collected by the National Institute of Health (INS) and the National Center for Epidemiology, Prevention and Disease Control (MINSA). Similarly, health systems reported COVID-19 metrics and the General Directorate of Epidemiology collected these metrics in Mexico. For this analysis, test counts describe all patients who received a COVID-19 test where the laboratory reported the results. Data describing the number of tests were only available for Mexico. Cases consist of the daily record of positive cases of COVID-19 confirmed with any type of test where the patient presented symptoms. Hospitalizations include COVID-19 cases that were admitted to the hospital. Deaths correspond to the total number of people who died in a healthcare facility and tested positive for COVID-19. COVID-19 metrics between April and December 2020 were aggregated to the district level in Peru; COVID-19 metrics from April through November, the end of the ENIGH survey collection period, were aggregated to the municipality level in Mexico.

Supplementary table 13. Binomial model of healthcare expenditure and utilization on COVID-19 hospitalizations

Supplementary table 16. Linear probability model of healthcare expenditure and utilization on COVID-19 tests
Columns (Mexico): Household OOP health spending; Catastrophic health expenditure (10%); Catastrophic health expenditure (25%); Share of individuals with a healthcare visit; Share of individuals with a private healthcare visit; Share of healthcare visits in the private sector
Supplementary table 16 reports coefficient estimates and standard errors. Statistical significance is notated as follows: * p < 0.05; ** p < 0.01; *** p < 0.001. Data on test rates were only available for Mexico.

Supplementary table 17. Binomial model of healthcare expenditure and utilization on COVID-19 tests
Supplementary table 17 reports coefficient estimates and standard errors. Statistical significance is notated as follows: * p < 0.05; ** p < 0.01; *** p < 0.001. Data on test rates were only available for Mexico.